Summary



Consider a linear regression model with n-dimensional response vector, regression
parameter $\beta = (\beta_1, \ldots, \beta_p)$ and independent and identically $N(0, \sigma^2)$ distributed
errors. Suppose that the parameter of interest is $\theta = a^T \beta$ where $a$ is a specified
vector. Define the parameter $\tau = c^T \beta - t$ where $c$ and $t$ are specified. Also suppose
that we have uncertain prior information that $\tau = 0$. Part of our evaluation of a
frequentist confidence interval for $\theta$ is the ratio (expected length of this confidence
interval) / (expected length of standard $1-\alpha$ confidence interval), which we call the
scaled expected length of this interval. We say that a $1-\alpha$ confidence interval
for $\theta$ utilizes this uncertain prior information if (a) the scaled expected length of
this interval is significantly less than 1 when $\tau = 0$, (b) the maximum value of the
scaled expected length is not too much larger than 1 and (c) this confidence interval
reverts to the standard $1-\alpha$ confidence interval when the data happen to strongly
contradict the prior information. Kabaila & Giri (2009, JSPI) present a new method
for finding such a confidence interval. Let $\hat\beta$ denote the least squares estimator of $\beta$.
Also let $\hat\Theta = a^T \hat\beta$ and $\hat\tau = c^T \hat\beta - t$. Using computations and new theoretical results,
we show that the performance of this confidence interval improves as $|\mathrm{Corr}(\hat\Theta, \hat\tau)|$
increases and $n - p$ decreases.

Key words: frequentist confidence interval; prior information; linear regression. 
1. Introduction



Hodges & Lehmann (1952), Bickel (1984) and Kempthorne (1983, 1987, 1988)
present frameworks for the utilization of uncertain prior information (about the
parameters of the model) in frequentist inference, mostly for point estimation. Such
information can arise from previous experience with similar data sets and/or expert
opinion and scientific background. We say that the confidence set $C$ is a $1-\alpha$
confidence set for the parameter of interest $\theta$ if its infimum coverage probability is
the specified value $1-\alpha$. We assess such a confidence set by its scaled expected
volume, defined to be the ratio (expected volume of $C$)/(expected volume of the
standard $1-\alpha$ confidence set). The first requirement of a $1-\alpha$ confidence set
that utilizes the uncertain prior information is that its scaled expected volume is
significantly less than 1 when the prior information is correct (Kabaila, 2009).

Confidence sets that satisfy this first requirement can be classified into the following
two groups. The first group consists of $1-\alpha$ confidence sets with scaled
expected volume that is less than or equal to 1 for all parameter values, so that
these dominate the standard $1-\alpha$ confidence set. Examples of such confidence sets
are the Stein-type confidence interval for the normal variance (see e.g. Maatta &
Casella, 1990 and Goutis & Casella, 1991) and Stein-type confidence sets for the
multivariate normal mean (see e.g. Stein, 1962, Berger, 1980, Casella & Hwang,
1983, Tseng & Brown, 1997, Efron, 2006 and Saleh, 2006). The second group consists
of $1-\alpha$ confidence sets that satisfy this first requirement, when dominance
of the usual $1-\alpha$ confidence set is not possible (the scaled expected volume must
exceed 1 for some parameter values). This second group includes confidence intervals
described by Pratt (1961), Brown et al. (1995) and Puza & O'Neill (2006a,b).
This second group also includes $1-\alpha$ confidence sets that satisfy the additional
requirements that (a) the maximum (over the parameter space) of the scaled expected
volume is not too much larger than 1 and (b) the confidence set reverts to
the usual $1-\alpha$ confidence set when the data happen to strongly contradict the
prior information. Confidence intervals that utilize uncertain prior information and
satisfy these additional requirements have been proposed by Farchione & Kabaila
(2008) and Kabaila & Giri (2009). The purpose of the present paper is to analyse
further properties of the Kabaila & Giri (2009) confidence interval.

Consider the linear regression model $Y = X\beta + \varepsilon$, where $Y$ is a random
$n$-vector of responses, $X$ is a known $n \times p$ matrix with linearly independent columns,
$\beta = (\beta_1, \ldots, \beta_p)$ is an unknown parameter vector and $\varepsilon \sim N(0, \sigma^2 I_n)$ where $\sigma^2$ is
an unknown positive parameter. Suppose that the parameter of interest is $\theta = a^T \beta$
where $a$ is a specified $p$-vector ($a \ne 0$). The inference of interest is a $1-\alpha$ confidence
interval for $\theta$. Define the parameter $\tau = c^T \beta - t$ where the vector $c$ and the number
$t$ are specified and $a$ and $c$ are linearly independent. Also suppose that we have
uncertain prior information that $\tau = 0$.

Part of our evaluation of a frequentist confidence interval for $\theta$ is the ratio

(expected length of this confidence interval) / (expected length of standard $1-\alpha$ confidence interval),

where the standard $1-\alpha$ confidence interval is obtained by fitting the full model to
the data. We call this ratio the scaled expected length of this confidence interval.
We say that a $1-\alpha$ confidence interval for $\theta$ utilizes this uncertain prior information
if the following three conditions hold. The first condition is that the scaled expected
length of this interval is significantly less than 1 when $\tau = 0$. The strong admissibility
of the standard $1-\alpha$ confidence interval, as proved by Kabaila, Giri & Leeb (2010),
implies that the maximum (over the parameter space) of the scaled expected length
of this interval must be greater than 1. The second condition is that this maximum
is not too much larger than 1. The third condition is that this confidence interval
reverts to the standard $1-\alpha$ confidence interval when the data happen to strongly
contradict the uncertain prior information that $\tau = 0$.

Kabaila & Giri (2009) present a new method for finding such a confidence
interval. For convenience, we refer to the confidence interval found by this method
as the KG confidence interval. This method is described briefly in the next section.
Let $\hat\beta$ denote the least squares estimator of $\beta$. Also let $\hat\Theta$ denote $a^T \hat\beta$ and $\hat\tau$ denote
$c^T \hat\beta - t$. We elucidate the dependence of the properties of this confidence interval
on $\mathrm{Corr}(\hat\Theta, \hat\tau)$ and $n - p$. Note that $\mathrm{Corr}(\hat\Theta, \hat\tau)$ is determined by $a$, $c$ and $X$, so that
it does not depend on the unknown parameters $\beta$ and $\sigma^2$.

In Section 3, we consider the dependence of these properties on $n - p$, when
$\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$. We prove that the KG confidence interval is centred at $\hat\Theta$ and is
equi-tailed. Using computations and a new theoretical result, we show that the
KG confidence interval (a) utilizes the uncertain prior information for small $n - p$
and (b) loses the ability to utilize this uncertain prior information as $n - p$ increases.
Let $\hat\sigma^2$ denote the usual unbiased estimator of $\sigma^2$, obtained by fitting the full model.
Our explanation for this finding is that when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$, the ability of the KG
confidence interval to utilize the uncertain prior information comes from the ability
to estimate $\sigma^2$ with greater accuracy than by using $\hat\sigma^2$, particularly when $n - p$ is
small.

In Section 4, we consider the dependence of the properties of the KG interval
on $n - p$, when $\mathrm{Corr}(\hat\Theta, \hat\tau) \ne 0$. We show, through computational results, that
the KG confidence interval utilizes the uncertain prior information irrespective of
how large $n - p$ is, with increasing ability to do so when $|\mathrm{Corr}(\hat\Theta, \hat\tau)|$ is large. Our
interpretation of this finding is that $\mathrm{Corr}(\hat\Theta, \hat\tau) \ne 0$ provides another source of the
ability to utilize the uncertain prior information.

Our overall conclusion is that there are two sources for the ability of a $1-\alpha$
confidence interval for $\theta$ to utilize the uncertain prior information. The first of these
sources is a non-zero $\mathrm{Corr}(\hat\Theta, \hat\tau)$. The second of these sources is the ability, for small
and medium $n - p$, to estimate $\sigma^2$ with more accuracy. The performance of the KG
confidence interval improves as $|\mathrm{Corr}(\hat\Theta, \hat\tau)|$ increases and $n - p$ decreases.

The scaled expected length of the KG interval is a function of the parameter
$\gamma = \tau / \sqrt{\mathrm{Var}(\hat\tau)}$. Figure 2 is a plot of the squared scaled expected length (which is
an even function of $\gamma$) as a function of $\gamma$ for this interval, with tuning parameter
$\lambda = 0.15$, for the case that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0.8165$, $n - p = 1$ and $1 - \alpha = 0.95$. When
the prior information is correct (i.e. when $\gamma = 0$), the gain is substantial: the
squared scaled expected length is 0.6960. The maximum value of the squared scaled
expected length is only 1.0626. This confidence interval reverts to the standard $1-\alpha$
confidence interval when the data strongly contradict the uncertain prior information
that $\tau = 0$. This is reflected by the fact that the squared scaled expected length
converges to 1 as $\gamma \to \infty$.

2. Description of the confidence interval of Kabaila & Giri (2009) 

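Let $v_{11} = \mathrm{Var}(\hat\Theta)/\sigma^2$, $v_{22} = \mathrm{Var}(\hat\tau)/\sigma^2$ and $v_{12} = \mathrm{Cov}(\hat\Theta, \hat\tau)/\sigma^2$. The standard
$1-\alpha$ confidence interval for $\theta$ is $I = [\hat\Theta - t(n-p)\sqrt{v_{11}}\,\hat\sigma,\ \hat\Theta + t(n-p)\sqrt{v_{11}}\,\hat\sigma]$,
where the quantile $t(m)$ is defined by $P(-t(m) \le T \le t(m)) = 1 - \alpha$ for $T \sim t_m$
and $\hat\sigma^2 = (Y - X\hat\beta)^T (Y - X\hat\beta)/(n - p)$.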

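Henceforth, suppose that $b : \mathbb{R} \to \mathbb{R}$ is an odd function and that $b$ and $s : [0, \infty) \to (0, \infty)$
are measurable functions. We use the notation $[a \pm b]$ for the interval $[a - b, a + b]$
($b > 0$). For each $b$ and $s$, define the following confidence interval for $\theta$:

$$J(b, s) = \left[\hat\Theta - \sqrt{v_{11}}\,\hat\sigma\, b\!\left(\frac{\hat\tau}{\hat\sigma\sqrt{v_{22}}}\right) \pm \sqrt{v_{11}}\,\hat\sigma\, s\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right)\right].$$
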
Let $\gamma = \tau/\sqrt{\mathrm{Var}(\hat\tau)} = \tau/(\sigma\sqrt{v_{22}})$ and $\rho = \mathrm{Corr}(\hat\Theta, \hat\tau) = v_{12}/\sqrt{v_{11} v_{22}}$. Note that
$\rho = \left(a^T (X^T X)^{-1} c\right)/\sqrt{a^T (X^T X)^{-1} a \; c^T (X^T X)^{-1} c}$ and so does not depend on
the unknown parameters $\beta$ and $\sigma^2$. For given $(b, s, \rho)$, the coverage probability
$P(\theta \in J(b, s))$ is an even function of $\gamma$, which we denote by $c(\gamma; b, s, \rho)$. The scaled
expected length of $J(b, s)$ is (expected length of $J(b, s)$)/(expected length of $I$) and
is an even function of $\gamma$ for given $s$, which we denote by $e(\gamma; s)$.

Define the class $\mathcal{B}$ to consist of the odd functions $b : \mathbb{R} \to \mathbb{R}$ that satisfy $b(x) = 0$
for all $|x| \ge d$, where $d$ is a (sufficiently large) specified positive number. Also define
the class $\mathcal{S}$ to consist of the functions $s : [0, \infty) \to (0, \infty)$, where $s(x) = t(n - p)$
for all $x \ge d$. Stated briefly, we find the $1-\alpha$ confidence interval for $\theta$ that utilizes
the uncertain prior information that $\tau = 0$ as follows. Find smooth functions $b \in \mathcal{B}$
and $s \in \mathcal{S}$ such that (a) the minimum of $c(\gamma; b, s, \rho)$ over $\gamma$ is $1 - \alpha$ and (b)

$$\lambda \int_{-\infty}^{\infty} \left(e(\gamma; s) - 1\right) d\gamma + \left(e(0; s) - 1\right) \qquad (1)$$

is minimized, where $\lambda$ is a specified nonnegative tuning parameter. The larger the
value of $\lambda$, the smaller the relative weight given to minimizing $e(\gamma; s)$ for $\gamma = 0$, as
opposed to minimizing $e(\gamma; s)$ for other values of $\gamma$. Since we require that $b \in \mathcal{B}$
and $s \in \mathcal{S}$, this confidence interval reverts to the standard $1-\alpha$ confidence interval
$I$ when the data happen to strongly contradict the uncertain prior information
that $\tau = 0$. The tuning parameter $\lambda$ and the functions $b$ and $s$ are chosen by the
statistician prior to looking at the observed response vector $y$. Further details of
the method used to make this choice are provided in Appendix A.

Example 1 ($2^3$ factorial experiment without replication)

Consider a $2^3$ factorial experiment without replication. Let $Y$ denote the response
and let $x_1$, $x_2$ and $x_3$ denote the coded levels for each of the 3 factors, where the
coded level takes either the value $-1$ or $1$. We will assume the model

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3 + \varepsilon$$

where $\beta_0, \beta_1, \beta_2, \beta_3, \beta_{12}, \beta_{13}, \beta_{23}, \beta_{123}$ are unknown parameters and $\varepsilon \sim N(0, \sigma^2)$,
where $\sigma^2$ is an unknown positive parameter.

For factorial experiments it is commonly believed that higher order interactions
are negligible (see e.g. Mead (1988, p. 368) and Hinkelmann & Kempthorne (1994,
p. 350)). Indeed, this type of belief is the basis for the design of fractional factorial
experiments. Suppose that $\beta_{123} = 0$ and that we have uncertain prior information
that $\beta_{12}$, $\beta_{13}$ and $\beta_{23}$ are all zero. Thus $n - p = 1$. We consider the particular
case that the parameter of interest $\theta$ is the contrast ($E(Y)$ for $(x_1, x_2, x_3) = (1, -1, -1)$)
$-$ ($E(Y)$ for $(x_1, x_2, x_3) = (1, -1, 1)$). In other words, $\theta = 2\beta_{123} - 2\beta_{13} +
2\beta_{23} - 2\beta_3$. Since we assume that $\beta_{123} = 0$, $\theta = -2\beta_{13} + 2\beta_{23} - 2\beta_3$.

Let $\tau = \beta_{23} - \beta_{13}$. The uncertain prior information that $\beta_{12}$, $\beta_{13}$ and $\beta_{23}$ are all
zero implies the uncertain prior information that $\tau = 0$. Note that $\mathrm{Corr}(\hat\Theta, \hat\tau) =
\sqrt{2/3} = 0.816496$. Figure 1 is a plot of the functions $b$ and $s$ for the KG $1-\alpha$
confidence interval for $\theta$ when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0.816496$, $n - p = 1$, $1 - \alpha =
0.95$, $\lambda = 0.15$, $d = 40$, the knots of the cubic spline $b$ (in the interval $[0, d]$) at
0, 15, 18, 21, 24, 27, 30, 40 and the knots of the cubic spline $s$ (in the interval $[0, d]$)
at 0, 3, 6, 9, 12, 15, 30, 40. To an excellent approximation, the coverage probability
of this confidence interval is 0.95 for all $\gamma$. The minimum coverage probability of
this confidence interval is 0.94992. Figure 2 is a plot of the squared scaled expected
length of this confidence interval as a function of $\gamma$. When the prior information is
correct (i.e. when $\gamma = 0$), the gain is substantial: the squared scaled expected
length is 0.6960. For $\gamma$ larger than 15, the squared scaled expected length is a
decreasing function and approaches 1 as $\gamma \to \infty$.
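
The value of the correlation quoted in Example 1 can be checked directly from the formula for $\rho$ given in Section 2. The following Python sketch is illustrative only and is not part of the computations reported here. The helper names `solve` and `corr_theta_tau`, the column ordering of the design matrix and the coefficient vectors `a` and `c` are assumptions made for this illustration, based on reading the example as $\theta = -2\beta_{13} + 2\beta_{23} - 2\beta_3$ and $\tau = \beta_{23} - \beta_{13}$ in the model with $\beta_{123} = 0$.

```python
import math
from itertools import product


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def corr_theta_tau(X, a, c):
    """rho = a'(X'X)^{-1}c / sqrt(a'(X'X)^{-1}a * c'(X'X)^{-1}c)."""
    p = len(a)
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    u = solve(XtX, a)  # (X'X)^{-1} a
    w = solve(XtX, c)  # (X'X)^{-1} c
    return dot(c, u) / math.sqrt(dot(a, u) * dot(c, w))


# Assumed column order: 1, x1, x2, x3, x1*x2, x1*x3, x2*x3 (beta_123 = 0).
X = [[1, x1, x2, x3, x1 * x2, x1 * x3, x2 * x3]
     for x1, x2, x3 in product((-1, 1), repeat=3)]
a = [0, 0, 0, -2, 0, -2, 2]  # theta = -2*beta_13 + 2*beta_23 - 2*beta_3
c = [0, 0, 0, 0, 0, -1, 1]   # tau = beta_23 - beta_13

rho = corr_theta_tau(X, a, c)
print(rho, math.sqrt(2.0 / 3.0))
```

Because the design is orthogonal, $X^T X$ is a multiple of the identity and the same value follows by hand from $(2 + 2)/\sqrt{12 \times 2}$; the general solver is kept so that the sketch also applies to non-orthogonal designs.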

Figure 1: Plots of the functions $b$ and $s$ for the KG $1-\alpha$ confidence interval for $\theta$
when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0.816496$, $n - p = 1$, $1 - \alpha = 0.95$, $\lambda = 0.15$, $d = 40$ and the knots
of the cubic splines $b$ and $s$ (in the interval $[0, d]$) are at 0, 15, 18, 21, 24, 27, 30, 40
and at 0, 2, 4, 6, 8, 10, 25, 40, respectively.

Figure 2: Plot of the squared scaled expected length $e^2(\gamma; s)$ (as a function of $\gamma =
\tau/(\sigma\sqrt{v_{22}})$) for the KG $1-\alpha$ confidence interval for $\theta$ when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0.816496$,
$n - p = 1$, $1 - \alpha = 0.95$, $\lambda = 0.15$, $d = 40$ and the knots of the cubic splines $b$ and
$s$ (in the interval $[0, d]$) are at 0, 15, 18, 21, 24, 27, 30, 40 and at 0, 2, 4, 6, 8, 10, 25, 40,
respectively.



3. Performance of the KG interval for $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$



In this section we consider the case that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$. For notational convenience,
we use $b \equiv 0$ to denote the function $b : \mathbb{R} \to \mathbb{R}$ satisfying $b(x) = 0$ for all
$x \in \mathbb{R}$. Corollary 1 (stated later in this section) shows that choosing $b \equiv 0$ does not
lead to any loss in the performance of the KG confidence interval for $\theta$. We therefore
make the restriction that $b \equiv 0$. This implies that the KG confidence interval has
the form

$$\left[\hat\Theta \pm \sqrt{v_{11}}\,\hat\sigma\, s\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right)\right], \qquad (2)$$

so that it is centred at $\hat\Theta$. Theorem 2 shows that the resulting KG confidence interval
is equi-tailed. As illustrated by Figure 3, computations show that the performance
of this confidence interval is good when $n - p$ is small, but degrades as $n - p$ increases,
with the gain over the standard interval disappearing as $n - p \to \infty$.

Figure 3: Plots of the squared scaled expected length $e^2(\gamma; s)$ (as a function of
$\gamma = \tau/(\sigma\sqrt{v_{22}})$) for the KG $1-\alpha$ confidence interval for $\theta$ when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$,
$1 - \alpha = 0.95$, $\lambda = 0.15$, $b \equiv 0$, $d = 12$ and the knots of the cubic spline $s$ (in the
interval $[0, d]$) are at 0, 1.5, 3, 4.5, 6, 7.5, 9, 10.5, 12. The values of $n - p$ are 1, 2, 3
and 4.

Theorem 3 confirms this computational finding. The explanation for this
finding is that when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$, the ability of the KG confidence interval to
utilize the uncertain prior information comes from the ability to estimate $\sigma^2$ with
greater accuracy than by using $\hat\sigma^2$. This ability is significant when $n - p$ is small,
but decreases as $n - p$ increases and disappears as $n - p \to \infty$.

The following theorem shows that for fixed function $s$, the coverage probability
of the confidence interval $J(b, s)$ is maximized by setting $b \equiv 0$.

Theorem 1. Suppose that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$ and that the function $s \in \mathcal{S}$ is given.
For each $\gamma \in \mathbb{R}$, the coverage probability $c(\gamma; b, s, \rho)$ is maximized with respect to the
function $b \in \mathcal{B}$, by setting $b \equiv 0$.

This theorem is proved in Appendix C. The following result, which is a corollary of
Theorem 1, shows that choosing $b \equiv 0$ does not lead to any loss in the performance
of the KG confidence interval.

Corollary 1. Suppose that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$. Suppose that $\mathcal{B}^*$ is a subset of $\mathcal{B}$ that
includes the function $b \equiv 0$. Also suppose that $\mathcal{S}^*$ is a subset of $\mathcal{S}$. The infimum
over $(b, s) \in \mathcal{B}^* \times \mathcal{S}^*$ of (1), subject to the coverage constraint

$$c(\gamma; b, s, \rho) \ge 1 - \alpha \ \text{ for all } \gamma \in \mathbb{R}, \qquad (3)$$

is equal to the infimum over $s \in \mathcal{S}^*$ of (1), subject to this constraint, when $b \equiv 0$.

This corollary is proved in Appendix D.

The following theorem implies that if $b \equiv 0$ then the KG confidence interval is
equi-tailed.

Theorem 2. Suppose that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$ and that $b \equiv 0$. Then the confidence
interval $J(b, s)$ for $\theta$ is equi-tailed.

This theorem is proved in Appendix E. The following theorem shows that the performance
of this confidence interval degrades as $n - p$ increases and disappears as
$n - p \to \infty$.

Theorem 3. Suppose that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$ and that $b \equiv 0$. Define
$\widetilde{\mathcal{S}} = \{s \in \mathcal{S} : c(\gamma; b, s, \rho) \ge 1 - \alpha \text{ for all } \gamma\}$.

Then

$$\inf_{s \in \widetilde{\mathcal{S}}} e(\gamma = 0; s) \ge 1 - \eta_{n-p}$$

where $\{\eta_m\}$ is a sequence of positive numbers converging to 0 as $m \to \infty$.

This theorem is proved in Appendix F. Although lengthy, this proof is quite straightforward
and elementary.

4. Performance of the KG interval for $\mathrm{Corr}(\hat\Theta, \hat\tau) \ne 0$

In this section we consider the case that $\rho = \mathrm{Corr}(\hat\Theta, \hat\tau) \ne 0$. For $n - p$ large, $\hat\sigma^2$
estimates $\sigma^2$ with great accuracy and so the ability of the KG confidence interval
to utilize the uncertain prior information does not come from the estimation of $\sigma^2$
with more accuracy. This ability comes instead from the correlation between $\hat\Theta$ and
$\hat\tau$. The computational results shown in Figure 4 for $n - p = 200$ illustrate this
point well. For ease of comparison, Figures 2, 3 and 4 have the same limits on their
horizontal and vertical axes.

Figure 4: Plots of the squared scaled expected length $e^2(\gamma; s)$ (as a function of
$\gamma = \tau/(\sigma\sqrt{v_{22}})$) for the KG $1-\alpha$ confidence interval for $\theta$ when $n - p = 200$,
$1 - \alpha = 0.95$, $\lambda = 0.15$, $d = 6$ and the knots of the cubic splines $b$ and $s$ (in the
interval $[0, d]$) are at 0, 1, 2, 3, 4, 5, 6. The values of $\rho = \mathrm{Corr}(\hat\Theta, \hat\tau)$ are 0.8, 0.6, 0.4
and 0.2.

5. Remarks 

Remark 5.1 It might be hoped that a confidence interval constructed in the following
way will be able to utilize this uncertain prior information. Carry out a preliminary
test of the null hypothesis that $\tau = 0$ against the alternative hypothesis that $\tau \ne 0$.
If this null hypothesis is rejected then we use the standard $1-\alpha$ confidence interval
for $\theta$. If, on the other hand, this null hypothesis is accepted then we use the standard
$1-\alpha$ confidence interval for $\theta$ computed under the assumption that $\tau = 0$. We call
this the naive $1-\alpha$ confidence interval for $\theta$. A computationally convenient formula for
the coverage probability of this confidence interval is given in Theorem 3 of Kabaila & Giri
(2009b). The minimum coverage probability of this confidence interval can be far
below $1 - \alpha$. Kabaila (1998) increases the half-width of this confidence interval, when
this null hypothesis is accepted, by the smallest possible value such that the adjusted
interval has minimum coverage $1 - \alpha$. He shows that such confidence intervals can
utilize the uncertain prior information that $\tau = 0$ when $n - p$ is small. However,
this adjusted confidence interval has the disadvantages that (a) it is obtained by an
ad hoc adjustment, (b) there may be far better adjustments and (c) the endpoints
of this interval are discontinuous functions of the data. Kabaila & Giri (2009a)
motivate the confidence interval analysed in the present paper by greatly "loosening
up" the form of the naive $1-\alpha$ confidence interval for $\theta$.

Remark 5.2 If we knew (with certainty) that $\tau = 0$ then the centre of the confidence
interval for $\theta$ would be

$$\hat\Theta - \sqrt{v_{11}}\,\hat\sigma\, b\!\left(\frac{\hat\tau}{\hat\sigma\sqrt{v_{22}}}\right), \qquad (4)$$

where $b(x) = \rho x$. This fact provides a hint that the following results may be true:

(R1) If $\rho = 0$ then there is no loss in the performance of the KG interval if we make
the additional constraint that $b \equiv 0$.

(R2) If $\rho > 0$ then there is no loss in the performance of the KG interval if we make
the additional constraint that $b(x) \ge 0$ for all $x \ge 0$.

(R3) If $\rho < 0$ then there is no loss in the performance of the KG interval if we make
the additional constraint that $b(x) \le 0$ for all $x \ge 0$.

As stated in Section 3 and proved in Appendix C, the result (R1) is true. Very
extensive numerical computations carried out by the authors suggest that the results
(R2) and (R3) are also true. For example, the top panel of Figure 1 of the present
paper and the top panel of Figure 2 of Kabaila & Giri (2009a) are consistent with
the results (R2) and (R3), respectively. This strongly suggests that, for all possible
data values, the centre of the KG interval cannot be obtained by a shift from $\hat\Theta$ in
the opposite direction to (4).

Remark 5.3 Suppose that we wish to construct an equi-tailed $1-\alpha$ confidence
interval for $\theta$ that utilizes the available uncertain prior information. As the following
two examples show, consideration of the case that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$ provides us with
a method of constructing such a confidence interval in the context of certain types
of prior information.

Example 2 ($2^3$ factorial experiment without replication, equi-tailed confidence
interval for $\theta$)

Consider the same model, uncertain prior information and parameter of interest $\theta$ as
delineated in the first two paragraphs of the description of Example 1. Suppose that
we wish to find an equi-tailed $1-\alpha$ confidence interval for $\theta$ that utilizes this prior
information. We find such a confidence interval by letting $\tau = \beta_{12}$. This uncertain
prior information implies the uncertain prior information that $\tau = 0$. Note that
$\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$, so that we can obtain the performance depicted in the top left-hand
plot of Figure 3.

Example 3 (prior information about a 2-dimensional parameter vector,
equi-tailed confidence interval for $\theta$)

Consider the model and parameter of interest $\theta$ described in the Introduction. Suppose
that $p > 2$ and $n - p$ is small. Let the 2-dimensional parameter vector $\psi$ be
defined to be $C^T \beta - t$, where $C$ is a specified $p \times 2$ matrix with linearly independent
columns and $t$ is a specified 2-vector. Suppose that $a$ does not belong to the
linear subspace spanned by the columns of $C$. Also suppose that previous experience
with similar data sets and/or expert opinion and scientific background suggest
that $\psi = 0$. In other words, suppose that we have uncertain prior information that
$\psi = 0$. Let $\hat\Psi = C^T \hat\beta - t$.

Suppose that our aim is to find an equi-tailed $1-\alpha$ confidence interval for $\theta$
that utilizes this uncertain prior information. If $\mathrm{Cov}(\hat\Theta, \hat\Psi_i) = 0$ then we can find
such a confidence interval by letting $\tau = \psi_i$ ($i = 1, 2$). If, on the other hand,
$\mathrm{Cov}(\hat\Theta, \hat\Psi_1) \ne 0$ and $\mathrm{Cov}(\hat\Theta, \hat\Psi_2) \ne 0$ then we can find such a confidence interval by
letting

$$\tau = \psi_1 - \frac{\mathrm{Cov}(\hat\Theta, \hat\Psi_1)}{\mathrm{Cov}(\hat\Theta, \hat\Psi_2)}\, \psi_2$$

and noting that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$, where

$$\hat\tau = \hat\Psi_1 - \frac{\mathrm{Cov}(\hat\Theta, \hat\Psi_1)}{\mathrm{Cov}(\hat\Theta, \hat\Psi_2)}\, \hat\Psi_2.$$
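
The claim that this choice of $\tau$ gives zero correlation follows in one line from the bilinearity of covariance. The display below is a worked check added for the reader's convenience; the shorthand $\kappa$ for the ratio of covariances is introduced here only for this illustration. Since both covariances are proportional to $\sigma^2$, the ratio $\kappa$ does not depend on the unknown parameters.

```latex
\begin{align*}
\kappa &= \frac{\mathrm{Cov}(\hat\Theta, \hat\Psi_1)}{\mathrm{Cov}(\hat\Theta, \hat\Psi_2)},
\qquad \hat\tau = \hat\Psi_1 - \kappa\, \hat\Psi_2, \\
\mathrm{Cov}(\hat\Theta, \hat\tau)
&= \mathrm{Cov}(\hat\Theta, \hat\Psi_1) - \kappa\, \mathrm{Cov}(\hat\Theta, \hat\Psi_2) \\
&= \mathrm{Cov}(\hat\Theta, \hat\Psi_1) - \mathrm{Cov}(\hat\Theta, \hat\Psi_1) = 0 .
\end{align*}
```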

Remark 5.4 As stated in Appendix A, we have chosen the functions b and s to 
be cubic splines in the interval [0,d]. Other choices of parametric forms for these 
functions are also possible. For example, one could choose these functions to be 
piecewise cubic Hermite interpolating polynomials in this interval. 

Remark 5.5 Instead of minimizing the criterion (1) (subject to the coverage constraint)
one could minimize the following criterion (subject to the same coverage
constraint)

$$\lambda \int_{-\infty}^{\infty} \left(e(\gamma; s) - 1\right) d\gamma + \int_{-\infty}^{\infty} \left(e(\gamma; s) - 1\right) \phi(\gamma; v)\, d\gamma \qquad (5)$$

where $\phi(\gamma; v)$ denotes the $N(0, v^2)$ probability density function and $v$ is a small
positive number. However, we expect that the use of (5) as an objective function will
lead to confidence intervals that are close to the corresponding confidence intervals
obtained by using (1) as the objective function.

Remark 5.6 Instead of minimizing the criterion (1), subject to the coverage constraint,
we may proceed as follows. We minimize $e(\gamma = 0; s)$, subject to both this
coverage constraint and the constraint that $\max_\gamma e(\gamma; s) \le \ell$, where $\ell$ is a specified
number satisfying $\ell > 1$. Theorems 1, 2 and 3 are relevant to this procedure. Also,
the obvious analogue of Corollary 1 holds for this procedure. The performance of
the confidence interval that results from this procedure improves as $|\mathrm{Corr}(\hat\Theta, \hat\tau)|$
increases and $n - p$ decreases. Figure 5 shows the performance of the confidence
interval resulting from this procedure when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0.816496$, $n - p = 1$,
$1 - \alpha = 0.95$ and $\ell = 1.0308$, so that $\max_\gamma e(\gamma; s)$ is the same as in Figure 2.

Figure 5: Plot of the squared scaled expected length $e^2(\gamma; s)$ (as a function of $\gamma =
\tau/(\sigma\sqrt{v_{22}})$) for the $1-\alpha$ confidence interval for $\theta$ when $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0.816496$,
$n - p = 1$, $1 - \alpha = 0.95$, $\ell = 1.0308$, $d = 50$ and the knots of the cubic splines $b$ and
$s$ (in the interval $[0, d]$) are at 0, 15, 18, 21, 24, 27, 30, 50 and at 0, 2, 4, 6, 8, 10, 25, 50,
respectively.

Remark 5.7 In the example presented at the end of Section 2, the uncertain prior
information is that $\beta_{12}$, $\beta_{13}$ and $\beta_{23}$ are all zero. As noted in the description of
this example, this implies the uncertain prior information that $\tau = \beta_{23} - \beta_{13}$ is
zero. By extending the work of Kabaila & Giri (2009a) to the case of uncertain
prior information that a vector parameter is zero, it should be possible (using the
methods of Kabaila & Farchione, 2012) to construct a confidence interval for $\theta$
that utilizes the original prior information (that $\beta_{12}$, $\beta_{13}$ and $\beta_{23}$ are all zero) more
effectively.

6. Conclusion 

Using computations and new theoretical results, we have shown that the performance
of the Kabaila & Giri (2009a) confidence interval for $\theta$ improves as $|\mathrm{Corr}(\hat\Theta, \hat\tau)|$
increases and $n - p$ decreases. This improvement in performance is illustrated by
Figures 2, 3 and 4.

Appendix A: Computation of the KG confidence interval 

In addition to requiring that $b \in \mathcal{B}$ and $s \in \mathcal{S}$, we require that the functions $b$
and $s$ are continuous. For computational tractability, $b$ and $s$ need to be restricted
further. Kabaila & Giri (2009a) take $b$ and $s$ to be cubic splines in the interval $[0, d]$.
We restrict the functions $b$ and $s$ even further. We require the function $s$ to be unimodal
on the interval $[0, d]$. In other words, we require that $s$ satisfies the condition
that there exists $q \in (0, d)$ such that $s(x)$ is (a) a strictly increasing function of
$x \in [0, q]$ and (b) a strictly decreasing function of $x \in [q, d]$. If $\mathrm{Corr}(\hat\Theta, \hat\tau) \ne 0$ then
the function $b$ is also required to be unimodal on the interval $[0, d]$. Let $\mathcal{B}^*$ and $\mathcal{S}^*$
denote the subsets of $\mathcal{B}$ and $\mathcal{S}$, respectively, that satisfy these requirements.

For judiciously chosen values of $d$, $\lambda$ and the knots of the cubic splines for $b$ and
$s$ in $[0, d]$, we carry out the following computational procedure.

Computational Procedure: Compute $b \in \mathcal{B}^*$ and $s \in \mathcal{S}^*$ such that (a) the minimum
of the coverage probability $c(\gamma; b, s, \rho)$ over $\gamma$ is $1 - \alpha$ and (b) the criterion (1) is
minimized. Theorem 1 of Kabaila & Giri (2009a) provides computationally convenient
expressions for $c(\gamma; b, s, \rho)$ and $e(\gamma; s)$. Discussion 5.6 of that paper provides
some further information about this computation. A simplified expression for (1)
is provided in Appendix B. The resulting confidence interval is assessed using the
following plots: plots of the functions $b$ and $s$ on the interval $[0, d]$ and plots of
the coverage probability $c(\gamma; b, s, \rho)$ and the squared scaled expected length $e^2(\gamma; s)$, as
functions of $\gamma \ge 0$.

Based on these plots, we choose $d$, $\lambda$ and the knots of the cubic splines for $b$ and $s$
in $[0, d]$, so that the confidence interval has not only desirable coverage probability
and scaled expected length properties, but also the functions $b$ and $s$ have desirable
properties, such as smoothness. We refer to the resulting confidence interval as the
KG $1-\alpha$ confidence interval.



Appendix B: Simplified expression for the criterion (1)

In this appendix we provide a simplified expression for (1). Define $W = \hat\sigma/\sigma$.
Note that $W$ has the same distribution as $\sqrt{Q/(n - p)}$ where $Q \sim \chi^2_{n-p}$. Let $f_W$
denote the probability density function of $W$. According to (8) of Kabaila & Giri
(2009a), (1) is equal to

$$\frac{2}{t(n-p)\,E(W)} \int_0^{\infty} \int_0^{d} \left(s(x) - t(n-p)\right) \left(\lambda + \phi(wx)\right) dx \; w^2 f_W(w)\, dw,$$

where $\phi$ denotes the $N(0, 1)$ probability density function. Now this is equal to

$$\frac{2}{t(n-p)\,E(W)} \left( \lambda \int_0^{d} \left(s(x) - t(n-p)\right) dx + \int_0^{d} \left(s(x) - t(n-p)\right) \int_0^{\infty} \phi(wx)\, w^2 f_W(w)\, dw\, dx \right).$$

By the following lemma, this is equal to

$$\frac{2}{t(m)\,E(W)} \int_0^{d} \left(s(x) - t(m)\right) \left( \lambda + \frac{1}{\sqrt{2\pi}} \left(\frac{m}{x^2 + m}\right)^{(m/2)+1} \right) dx,$$

where $m = n - p$.

Lemma 1.

$$\int_0^{\infty} \phi(wx)\, w^2 f_W(w)\, dw = \frac{1}{\sqrt{2\pi}} \left(\frac{m}{x^2 + m}\right)^{(m/2)+1}. \qquad (6)$$

Proof. Note that $f_W(w) = 2mw\, f_m(mw^2)$, where $f_m$ denotes the $\chi^2_m$ probability
density function. Substituting the expressions for $\phi$ and $f_W$ into the left hand side
of (6), we find that this is equal to

$$\frac{2\, m^{m/2}}{\sqrt{2\pi}\, \Gamma(m/2)\, 2^{m/2}} \int_0^{\infty} w^{m+1} \exp\!\left(-\frac{1}{2}(m + x^2) w^2\right) dw.$$

By (A2.1.3) of Box & Tiao (1973), this is equal to the right hand side of (6).

□
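
Lemma 1 and the simplified form of criterion (1) lend themselves to a quick numerical sanity check. The Python sketch below is illustrative only and uses nothing beyond the standard library: it compares a composite Simpson approximation of $\int_0^\infty \phi(wx)\, w^2 f_W(w)\, dw$ with the closed form $(2\pi)^{-1/2} (m/(x^2+m))^{(m/2)+1}$, and then evaluates the simplified criterion for two choices of $s$ when $m = 1$. The function names, the truncation point `upper = 12`, the grid sizes and the function `s_dip` are assumptions made for this illustration; in particular `s_dip` is a hypothetical function that has not been checked against the coverage constraint.

```python
import math


def phi(z):
    """N(0, 1) probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def simpson(f, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def lemma_lhs(x, m, upper=12.0, n=4000):
    """Numerical value of the left hand side of Lemma 1."""
    const = 2.0 * m ** (m / 2.0) / (2.0 ** (m / 2.0) * math.gamma(m / 2.0))

    def integrand(w):
        # w^2 f_W(w) = const * w^(m+1) * exp(-m w^2 / 2)
        return phi(w * x) * const * w ** (m + 1) * math.exp(-0.5 * m * w * w)

    return simpson(integrand, 0.0, upper, n)


def lemma_rhs(x, m):
    """Closed form on the right hand side of Lemma 1."""
    return (m / (x * x + m)) ** (m / 2.0 + 1.0) / math.sqrt(2.0 * math.pi)


def criterion(s, d, lam, m, t_m, n=2000):
    """Simplified expression for criterion (1), integrated over [0, d]."""
    # E(W) for W = sqrt(Q / m), Q ~ chi-squared with m degrees of freedom
    e_w = math.sqrt(2.0 / m) * math.gamma((m + 1) / 2.0) / math.gamma(m / 2.0)
    value = simpson(lambda x: (s(x) - t_m) * (lam + lemma_rhs(x, m)), 0.0, d, n)
    return 2.0 * value / (t_m * e_w)


# t(1) for 1 - alpha = 0.95: the t distribution with 1 d.f. is Cauchy.
T1 = math.tan(math.pi * 0.475)


def s_flat(x):
    return T1  # the standard interval: s(x) = t(1) everywhere


def s_dip(x):
    # hypothetical s, shorter than t(1) near 0 (coverage NOT checked)
    return T1 * (1.0 - 0.3 * max(0.0, 1.0 - x / 10.0))


for m in (1, 2, 5):
    for x in (0.0, 0.7, 2.5):
        print(m, x, lemma_lhs(x, m), lemma_rhs(x, m))
print(criterion(s_flat, 40.0, 0.15, 1, T1), criterion(s_dip, 40.0, 0.15, 1, T1))
```

For $s \equiv t(m)$ the criterion is exactly zero, which corresponds to the standard interval; any $s$ that lies below $t(m)$ on part of $[0, d]$ and never exceeds it gives a negative value, but such an $s$ will in general violate the coverage constraint, which is why the constrained optimization of Appendix A is needed.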

Appendix C: Proof of Theorem 1

In this appendix, we prove Theorem 1. Suppose that $\mathrm{Corr}(\hat\Theta, \hat\tau) = 0$ and that
the function $s \in \mathcal{S}$ is given. Fix $\gamma \in \mathbb{R}$.

Maximizing $c(\gamma; b, s, \rho)$ with respect to $b \in \mathcal{B}$ is equivalent to minimizing $1 - \alpha -
c(\gamma; b, s, \rho)$ with respect to $b \in \mathcal{B}$. Define

$$k(x, w, \gamma) = \Phi\left(b(x) w + s(|x|) w\right) - \Phi\left(b(x) w - s(|x|) w\right),$$

$$k^{\dagger}(w) = 2\Phi\left(t(n - p) w\right) - 1,$$

where $\Phi$ denotes the $N(0, 1)$ distribution function. According to p. 307 of Kabaila,
Giri & Leeb (2010),

$$1 - \alpha - c(\gamma; b, s, \rho) = -\left(r_1(b, s, \gamma) + r_2(b, s, \gamma)\right)$$

where

$$r_1(b, s, \gamma) = \int_0^{\infty} \int_0^{d} \left(k(x, w, \gamma) - k^{\dagger}(w)\right) \phi(wx - \gamma)\, w\, f_W(w)\, dx\, dw,$$

$$r_2(b, s, \gamma) = \int_0^{\infty} \int_0^{d} \left(k(-x, w, \gamma) - k^{\dagger}(w)\right) \phi(wx + \gamma)\, w\, f_W(w)\, dx\, dw.$$

Thus, minimizing $1 - \alpha - c(\gamma; b, s, \rho)$ with respect to $b \in \mathcal{B}$ is equivalent to maximizing
$r_1(b, s, \gamma) + r_2(b, s, \gamma)$ with respect to $b \in \mathcal{B}$.

According to p. 309 of Kabaila, Giri & Leeb (2010), for fixed $s > 0$ and $w > 0$,
$\Phi(bw + sw) - \Phi(bw - sw)$ is maximized with respect to $b \in \mathbb{R}$ at $b = 0$. Thus
$\Phi(b(x)w + s(x)w) - \Phi(b(x)w - s(x)w)$ is, for each $x \in [0, d]$ and $w > 0$, maximized
with respect to $b(x) \in \mathbb{R}$ at $b(x) = 0$. Since $\phi(wx - \gamma)\, w f_W(w) > 0$ for all $x \in [0, d]$
and $w > 0$, $r_1(b, s, \gamma)$ is maximized with respect to the function $b \in \mathcal{B}$ by setting
$b \equiv 0$. A similar argument shows that $r_2(b, s, \gamma)$ is maximized with respect to the
function $b \in \mathcal{B}$ by setting $b \equiv 0$. Thus, $r_1(b, s, \gamma) + r_2(b, s, \gamma)$ is maximized with
respect to the function $b \in \mathcal{B}$ by setting $b \equiv 0$.
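
The key pointwise fact used in this proof, that $\Phi(bw + sw) - \Phi(bw - sw)$ is largest at $b = 0$ for fixed $s > 0$ and $w > 0$, is easy to illustrate numerically. The short Python sketch below is illustrative only; the function name `k_value`, the particular values of `s` and `w` and the grid of `b` values are arbitrary choices made for this illustration, not quantities taken from the paper.

```python
import math


def Phi(z):
    """N(0, 1) distribution function, written in terms of math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def k_value(b, s, w):
    """Probability that a N(0, 1) variable lies in [b*w - s*w, b*w + s*w]."""
    return Phi(b * w + s * w) - Phi(b * w - s * w)


s, w = 2.0, 0.8                                  # arbitrary fixed s > 0, w > 0
grid = [i / 100.0 for i in range(-300, 301)]     # candidate values of b
best_b = max(grid, key=lambda b: k_value(b, s, w))
print(best_b, k_value(best_b, s, w))
```

The maximum over the grid is attained at `b = 0`, as expected: the quantity is the probability that a standard normal variable falls in an interval of fixed length $2sw$, and that probability is largest when the interval is centred at the mode of the density.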

Appendix D: Proof of Corollary 1 

Suppose that $\mathrm{Corr}(\hat\theta, \hat\tau) = 0$. Suppose that $\mathcal{B}^*$ is a subset of $\mathcal{B}$ that includes the
function $b = 0$. Also suppose that $\mathcal{S}^*$ is a subset of $\mathcal{S}$.

The infimum over $(b, s) \in \mathcal{B}^* \times \mathcal{S}^*$ of (1), subject to the constraint (3), is less than
or equal to the infimum over $s \in \mathcal{S}^*$ of (1), subject to this constraint, when $b = 0$. We
complete the proof by contradiction. Suppose that the infimum over $(b, s) \in \mathcal{B}^* \times \mathcal{S}^*$
of (1), subject to the constraint (3), is less than the infimum over $s \in \mathcal{S}^*$ of (1),
subject to this constraint, when $b = 0$. Thus there exists $(b', s') \in \mathcal{B}^* \times \mathcal{S}^*$ such
that the constraint (3), evaluated at $(b, s) = (b', s')$, is satisfied and (1), evaluated at
$(b, s) = (b', s')$, is less than the infimum over $s \in \mathcal{S}^*$ of (1), subject to this constraint,
when $b = 0$.

By Theorem 1, the following is true. If we let $\tilde b = 0$ then $(b, s) = (\tilde b, s')$ satisfies
the constraint (3). Also, (1), evaluated at $(b, s) = (\tilde b, s')$, is equal to (1), evaluated
at $(b, s) = (b', s')$. We have established a contradiction.

Appendix E: Proof of Theorem 2 

In this appendix, we prove Theorem 2. Suppose that $\mathrm{Corr}(\hat\theta, \hat\tau) = 0$ and that
$b = 0$. The confidence interval $J(b, s)$ has the form (2). Let $G = (\hat\theta - \theta)/(\sigma \sqrt{v_{11}})$
and $H = \hat\tau/(\sigma \sqrt{v_{22}})$. Note that $G$ and $H$ are independent random variables and
$G \sim N(0, 1)$. Now

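$$P\left(\theta < \hat\theta - \sqrt{v_{11}}\,\hat\sigma\, s\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right)\right) = P\left(G > W\, s\!\left(\frac{|H|}{W}\right)\right). \tag{7}$$

Also,

$$P\left(\theta > \hat\theta + \sqrt{v_{11}}\,\hat\sigma\, s\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right)\right) = P\left(\tilde G > W\, s\!\left(\frac{|H|}{W}\right)\right), \tag{8}$$

where $\tilde G = -G$. Thus (7) = (8).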

Appendix F: Proof of Theorem 3 

Suppose that $\mathrm{Corr}(\hat\theta, \hat\tau) = 0$ and that $b = 0$. Theorem 3 provides a lower bound
for $e(\gamma = 0; s) - 1$, subject to the constraints that $s \in \mathcal{S}$ and $c(\gamma; b, s, \rho) \ge 1 - \alpha$
for all $\gamma$. We prove this result using the framework of compromise decision theory
(Kempthorne, 1983, 1987, 1988). Specifically, we use Theorem 2.2 (a) of Kabaila &
Tuck (2008) to prove this result.

Define $R_1(s; \gamma) = e(\gamma; s) - 1$. Also define $\pi_1$ to be the unit step function. Thus

$$\int_{-\infty}^{\infty} R_1(s; \gamma)\, d\pi_1(\gamma) = e(\gamma = 0; s) - 1.$$

Now define $R_2(s; \gamma) = 1 - \alpha - c(\gamma; b, s, \rho)$. Define $\pi_2$ to be the unit step function. Now
define

$$g(s; \lambda) = \lambda \int_{-\infty}^{\infty} R_1(s; \gamma)\, d\pi_1(\gamma) + (1 - \lambda) \int_{-\infty}^{\infty} R_2(s; \gamma)\, d\pi_2(\gamma),$$

where $0 < \lambda < 1$. Let $m = n - p$. For each positive integer $m$, we will define
$\lambda(m) \in (0, 1)$ and we will find $s$ that minimizes $g(s; \lambda(m))$ with respect to $s \in \mathcal{S}$.
Denote this minimizing value of $s$ by $s_{\lambda(m)}$. We will also note that



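$$\sup_{\gamma} R_2(s_{\lambda(m)}; \gamma) = 0$$

and that

$$\int_{-\infty}^{\infty} R_2(s_{\lambda(m)}; \gamma)\, d\pi_2(\gamma) \tag{9}$$

converges to 0 as $m \to \infty$. Theorem 2.2 (a) of Kabaila & Tuck (2008) implies that

$$\inf_{s \in \mathcal{S}} \int_{-\infty}^{\infty} R_1(s; \gamma)\, d\pi_1(\gamma) \ge \int_{-\infty}^{\infty} R_1(s_{\lambda(m)}; \gamma)\, d\pi_1(\gamma) - \frac{1 - \lambda(m)}{\lambda(m)}\, \nu_m,$$

for each positive integer $m$. In other words,

$$\inf_{s \in \mathcal{S}} e(\gamma = 0; s) - 1 \ge e(\gamma = 0; s_{\lambda(m)}) - 1 - \frac{1 - \lambda(m)}{\lambda(m)}\, \nu_m. \tag{10}$$

We will then note that $e(\gamma = 0; s_{\lambda(m)}) > 1$ and show that $\nu_m (1 - \lambda(m))/\lambda(m)$
converges to 0, as $m \to \infty$.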

It follows from Theorem 1 (b) of Kabaila & Giri (2009a) that

$$e(\gamma = 0; s) - 1 = \frac{2}{t(m)E(W)} \int_0^{d} \big(s(x) - t(m)\big) \int_0^{\infty} \phi(wx)\, w^2 f_W(w)\, dw\, dx,$$

where $\phi$ denotes the $N(0, 1)$ probability density function. It follows from p. 307 of
Kabaila, Giri & Leeb (2010) that $1 - \alpha - c(\gamma; b, s, \rho)$ is equal to

$$-2 \int_0^{d} \int_0^{\infty} \big(\Phi(s(x)w) - \Phi(t(m)w)\big)\big(\phi(wx - \gamma) + \phi(wx + \gamma)\big)\, w\, f_W(w)\, dw\, dx,$$

where $\Phi$ denotes the $N(0, 1)$ distribution function. Thus

$$g(s; \lambda) = \frac{2\lambda}{t(m)E(W)} \int_0^{d} \big(s(x) - t(m)\big) \int_0^{\infty} \phi(wx)\, w^2 f_W(w)\, dw\, dx - 4(1 - \lambda) \int_0^{d} \int_0^{\infty} \big(\Phi(s(x)w) - \Phi(t(m)w)\big)\, \phi(wx)\, w\, f_W(w)\, dw\, dx.$$

Minimizing this function with respect to $s \in \mathcal{S}$ is equivalent to minimizing

$$\tilde g(s; \lambda) = \int_0^{d} \left( \frac{\lambda}{t(m)E(W)} \int_0^{\infty} \phi(wx)\, w^2 f_W(w)\, dw \; s(x) - 2(1 - \lambda) \int_0^{\infty} \Phi(s(x)w)\, \phi(wx)\, w\, f_W(w)\, dw \right) dx$$

with respect to $s \in \mathcal{S}$. We find a minimizing value of $s \in \mathcal{S}$ as follows. For each
$x \in [0, d)$, we minimize

$$\left( \frac{\lambda}{2(1 - \lambda)\,t(m)E(W)} \int_0^{\infty} \phi(wx)\, w^2 f_W(w)\, dw \right) t - \int_0^{\infty} \Phi(tw)\, \phi(wx)\, w\, f_W(w)\, dw \tag{11}$$

with respect to $t > 0$ and then set $s(x)$ equal to this minimizing value. The derivative
of (11) with respect to $t$ is equal to

$$\frac{\lambda}{2(1 - \lambda)\,t(m)E(W)} \int_0^{\infty} \phi(wx)\, w^2 f_W(w)\, dw - \int_0^{\infty} \phi(tw)\, \phi(wx)\, w^2 f_W(w)\, dw. \tag{12}$$

We simplify this expression using the following lemma.



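Lemma 2.

$$\int_0^{\infty} \phi(tw)\, \phi(wx)\, w^2 f_W(w)\, dw = \frac{1}{2\pi} \left( \frac{m}{t^2 + x^2 + m} \right)^{(m/2)+1}.$$

Proof. Note that

$$\phi(tw)\, \phi(wx) = \frac{1}{\sqrt{2\pi}}\, \phi(w\tilde x),$$

where $\tilde x = \sqrt{t^2 + x^2}$. Hence

$$\int_0^{\infty} \phi(tw)\, \phi(wx)\, w^2 f_W(w)\, dw = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} \phi(w\tilde x)\, w^2 f_W(w)\, dw = \frac{1}{2\pi} \left( \frac{m}{t^2 + x^2 + m} \right)^{(m/2)+1}$$

by Lemma 1.

□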
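The identity in Lemma 2 can be checked by numerical integration. The sketch below is illustrative only. It assumes that $mW^2$ has a chi-squared distribution with $m$ degrees of freedom (the usual distribution of $\hat\sigma/\sigma$ in this regression model), truncates the integral at an arbitrary upper limit, and uses a composite Simpson rule; the function names `f_W`, `lemma2_lhs` and `lemma2_rhs` are hypothetical.

```python
import math

def phi(z):
    """N(0, 1) probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def f_W(w, m):
    """Density of W, ASSUMING m * W**2 ~ chi-squared with m degrees of freedom."""
    if w <= 0.0:
        return 0.0
    v = m * w * w
    log_chi2_pdf = ((0.5 * m - 1.0) * math.log(v) - 0.5 * v
                    - 0.5 * m * math.log(2.0) - math.lgamma(0.5 * m))
    return 2.0 * m * w * math.exp(log_chi2_pdf)  # change of variables v = m w^2

def lemma2_lhs(t, x, m, upper=12.0, n=20000):
    """Composite Simpson approximation to the integral on the left-hand side."""
    h = upper / n  # n must be even

    def integrand(w):
        return phi(t * w) * phi(w * x) * w * w * f_W(w, m)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def lemma2_rhs(t, x, m):
    """Closed form stated in Lemma 2."""
    return (m / (t * t + x * x + m)) ** (0.5 * m + 1.0) / (2.0 * math.pi)

for t, x, m in [(2.0, 1.0, 5), (0.5, 3.0, 2), (2.228, 0.0, 10)]:
    print(t, x, m, lemma2_lhs(t, x, m), lemma2_rhs(t, x, m))
```

Under the stated distributional assumption, the two printed columns should agree to many decimal places for each choice of $(t, x, m)$.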

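By this lemma and Lemma 1 (stated in Appendix B), (12) is equal to

$$\frac{1}{2\pi} \left( \frac{\lambda}{(1 - \lambda)\,t(m)E(W)} \sqrt{\frac{\pi}{2}} \left( \frac{m}{x^2 + m} \right)^{(m/2)+1} - \left( \frac{m}{t^2 + x^2 + m} \right)^{(m/2)+1} \right). \tag{13}$$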



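This is an increasing function of $t$ that approaches a positive number as $t \to \infty$.
Define $\lambda(m)$ to be the solution for $\lambda \in (0, 1)$ of

$$\sqrt{ m \left( \left( \sqrt{\frac{2}{\pi}}\, \frac{(1 - \lambda)\,t(m)E(W)}{\lambda} \right)^{1/((m/2)+1)} - 1 \right) } = t(m).$$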



Henceforth, suppose that $\lambda = \lambda(m)$. Note that (13) approaches a negative number
as $t \downarrow 0$. Thus, for each $x \in [0, d)$, we find the value of $t > 0$ that minimizes (11) by
solving (13) $= 0$ for $t > 0$. For each $x \in [0, d)$, this solution is $t = \sqrt{1 + (x^2/m)}\; t(m)$.
Thus

$$s_{\lambda(m)}(x) = \begin{cases} \sqrt{1 + \dfrac{x^2}{m}}\; t(m) & \text{for } x \in [0, d) \\ t(m) & \text{for } x \ge d. \end{cases}$$
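The construction of $\lambda(m)$ and $s_{\lambda(m)}$ can be illustrated numerically. The sketch below makes several assumptions that are not in the original text: $E(W)$ is computed under the assumption that $mW^2$ is chi-squared with $m$ degrees of freedom; $\lambda(m)$ is obtained by rearranging its defining equation into the closed form $\lambda/(1-\lambda) = \sqrt{2/\pi}\, t(m)E(W)\,(1 + t(m)^2/m)^{-((m/2)+1)}$; the value $t(m) = 2.228$ (roughly the 0.975 quantile of the $t_{10}$ distribution) and $d = 6$ are example inputs; and all function names are hypothetical.

```python
import math

def expected_W(m):
    """E(W), ASSUMING m * W**2 ~ chi-squared with m degrees of freedom."""
    return math.sqrt(2.0 / m) * math.exp(
        math.lgamma(0.5 * (m + 1)) - math.lgamma(0.5 * m))

def lam(m, t_m):
    """lambda(m) in (0, 1), from rearranging the defining equation."""
    q = (1.0 + t_m * t_m / m) ** (-(0.5 * m + 1.0))
    r = math.sqrt(2.0 / math.pi) * t_m * expected_W(m) * q  # lambda / (1 - lambda)
    return r / (1.0 + r)

def expr13(t, x, m, t_m):
    """Expression (13): the derivative of (11) with respect to t at lambda = lambda(m)."""
    l = lam(m, t_m)
    k = 0.5 * m + 1.0
    c = math.sqrt(math.pi / 2.0) * l / ((1.0 - l) * t_m * expected_W(m))
    first = c * (m / (x * x + m)) ** k
    second = (m / (t * t + x * x + m)) ** k
    return (first - second) / (2.0 * math.pi)

def s_lam(x, m, t_m, d):
    """The minimizing function s_{lambda(m)}(x) for x >= 0."""
    return math.sqrt(1.0 + x * x / m) * t_m if x < d else t_m

m, t_m, d = 10, 2.228, 6.0  # example inputs, not taken from the paper
print("lambda(m) =", lam(m, t_m))
for x in [0.0, 1.5, 4.0]:
    t_star = s_lam(x, m, t_m, d)
    print("x =", x, "s(x) =", t_star, "(13) at s(x) =", expr13(t_star, x, m, t_m))
```

With these inputs, (13) should be numerically zero at $t = s_{\lambda(m)}(x)$, negative for smaller $t$ and positive for larger $t$, which matches the description of (13) as an increasing function of $t$.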

Now

$$\sup_{\gamma} R_2(s_{\lambda(m)}; \gamma) = 1 - \alpha - \inf_{\gamma} c(\gamma; b = 0, s_{\lambda(m)}, \rho = 0).$$

Since $s_{\lambda(m)}(x) \ge t(m)$ for all $x \ge 0$, the following easily-proved lemma implies that

$$\sup_{\gamma} R_2(s_{\lambda(m)}; \gamma) \le 0. \tag{14}$$

Lemma 3. Suppose that $b : \mathbb{R} \to \mathbb{R}$, $s : [0, \infty) \to (0, \infty)$ and $\tilde s : [0, \infty) \to (0, \infty)$
are measurable functions. Also suppose that $\tilde s(x) \ge s(x)$ for all $x \ge 0$. Then
$c(\gamma; b, \tilde s, \rho) \ge c(\gamma; b, s, \rho)$ for all $\gamma$.

The following lemma implies that $c(\gamma; b = 0, s_{\lambda(m)}, \rho = 0) \to 1 - \alpha$, as $\gamma \to \infty$. It
follows from (14) that

$$\sup_{\gamma} R_2(s_{\lambda(m)}; \gamma) = 0.$$

Lemma 4. Suppose that the positive integer $m$, $b \in \mathcal{B}$, $s \in \mathcal{S}$ and $\rho \in (-1, 1)$ are
given. Then $c(\gamma; b, s, \rho) \to 1 - \alpha$, as $\gamma \to \infty$.

Proof. It is an immediate consequence of a result stated on p. 3428 of Kabaila & Giri
(2009a) that

$$\big| c(\gamma; b, s, \rho) - (1 - \alpha) \big| \le \int_0^{\infty} \int_{-dw}^{dw} \phi(h - \gamma)\, dh\, f_W(w)\, dw,$$

where $f_W$ denotes the probability density function of $W = \hat\sigma/\sigma$. The result is a
straightforward consequence of this inequality.

□
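The behaviour of the bound in the proof of Lemma 4 can be illustrated numerically. The sketch below is not part of the proof. It assumes that $mW^2$ is chi-squared with $m$ degrees of freedom, evaluates the inner integral as $\Phi(dw - \gamma) - \Phi(-dw - \gamma)$, truncates the outer integral at an arbitrary upper limit and uses a composite Simpson rule. The values $d = 6$ and $m = 10$ and the function name `coverage_gap_bound` are illustrative assumptions.

```python
import math

def Phi(z):
    """N(0, 1) distribution function, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def f_W(w, m):
    """Density of W, ASSUMING m * W**2 ~ chi-squared with m degrees of freedom."""
    if w <= 0.0:
        return 0.0
    v = m * w * w
    log_chi2_pdf = ((0.5 * m - 1.0) * math.log(v) - 0.5 * v
                    - 0.5 * m * math.log(2.0) - math.lgamma(0.5 * m))
    return 2.0 * m * w * math.exp(log_chi2_pdf)

def coverage_gap_bound(gamma, d, m, upper=12.0, n=4000):
    """Simpson approximation to the double integral bounding |c - (1 - alpha)|."""
    h = upper / n  # n must be even

    def integrand(w):
        inner = Phi(d * w - gamma) - Phi(-d * w - gamma)  # integral of phi(h - gamma)
        return inner * f_W(w, m)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

d, m = 6.0, 10  # example inputs, not taken from the paper
for gamma in [0.0, 5.0, 10.0, 30.0]:
    print("gamma =", gamma, "bound =", coverage_gap_bound(gamma, d, m))
```

Under these assumptions the printed bound should shrink toward 0 as $\gamma$ grows, which is the mechanism behind the limit stated in Lemma 4.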

Define $\nu_m$ by $\nu_m = -\int_{-\infty}^{\infty} R_2(s_{\lambda(m)}; \gamma)\, d\pi_2(\gamma)$ and note that

$$\nu_m = c(\gamma = 0; b = 0, s_{\lambda(m)}, \rho = 0) - (1 - \alpha).$$

By Lemma 3,

$$c(\gamma = 0; b = 0, s_{\lambda(m)}, \rho = 0) \le c\big(\gamma = 0; b = 0, s = \sqrt{1 + (d^2/m)}\; t(m), \rho = 0\big),$$

where $s = \sqrt{1 + (d^2/m)}\; t(m)$ denotes the function $s$ that satisfies $s(x) = \sqrt{1 + (d^2/m)}\; t(m)$
for all $x \in \mathbb{R}$. Thus $\nu_m \downarrow 0$ as $m \to \infty$. As noted earlier, (10) holds. Since